Frontiers in Molecular Biosciences

Frontiers Media SA

Preprints posted in the last 7 days, ranked by how well they match Frontiers in Molecular Biosciences's content profile, based on 100 papers previously published here. The average preprint has a 0.14% match score for this journal, so anything above that is already an above-average fit.

1
Development and Prospective Validation of Predictive Model for Early Hemodynamic Deterioration in Critical Care: A Multicenter Study

Nagori, A.; Singh, P.; Firdos, S.; Devadiga, A.; Vats, V.; Gupta, A.; Bandhey, H.; Ailavadi, P.; Awasthi, R.; Narotam, N.; Mishra, A.; Lodha, R.; Sethi, T.

2026-06-10 intensive care and critical care medicine 10.64898/2026.06.05.26353765 medRxiv
Top 4%
0.9%

High-frequency physiological monitoring in ICUs can identify impending deterioration hours before clinical recognition, yet extracting reliable early-warning signals from noisy vital-sign streams remains challenging. We present SIgnose, an interpretable prediction framework for early detection of abnormal shock index (SI), built from routinely monitored vital signs using physiologic variability and nonlinear time-series features. SIgnose was developed on the eICU Collaborative Research Database and externally validated on the MIMIC-III adult database and a pediatric SafeICU cohort (AIIMS New Delhi), with additional prospective validation in the pediatric ICU. We benchmarked three representation strategies: (i) engineered physiologic variability and nonlinear time-series features, (ii) deep learning, and (iii) Llama-3.1-8B embeddings with low-rank adaptation. Physiologic variability features consistently demonstrated superior cross-cohort generalization. The final model used 3,970 features from five vital signs to predict abnormal SI up to 8 hours ahead, achieving AUROC 0.861 (95% CI 0.859-0.863) and AUPRC 0.927 (95% CI 0.925-0.929) on eICU. External validation yielded AUROC 0.870 (95% CI 0.863-0.876) and AUPRC 0.935 (95% CI 0.930-0.940) on MIMIC-III, and AUROC 0.875 (95% CI 0.863-0.888) and AUPRC 0.915 (95% CI 0.898-0.930) on SafeICU; prospective pediatric validation (n = 88) achieved AUROC 0.885 (95% CI 0.868-0.902) and AUPRC 0.911 (95% CI 0.882-0.936). SHAP interpretability analysis identified heart rate variability, respiratory trend dynamics, and multi-scale blood pressure variability as key early-warning signatures. These findings establish SIgnose as a reproducible, low-compute, early-warning framework and demonstrate that physiologic variability features provide robust, generalizable representations for early deterioration detection across adult and pediatric critical care.

2
Incremental Clinical Value of Single-Molecule Nanopore Sequencing in Thalassemia Testing: A Prospective Double-blind, Multicenter Study

Xiang, J.; Zhu, B.; Xu, H.; Chen, Y.; Sun, X.; xiang, r.; Zhao, Y.; Liu, W.; Zhang, L.; He, J.; liu, j.; Chen, Y.; Fan, Z.; Zhang, H.; Tan, J.; Pang, L.; Shi, L.; Kong, Y.; Cai, A.

2026-06-09 hematology 10.64898/2026.06.09.26354559 medRxiv
Top 4%
0.8%

Background: Thalassemia is one of the most common monogenic disorders worldwide, yet current screening strategies that combine hematological testing with molecular assays still carry a risk of missed diagnoses and suboptimal efficiency, particularly for complex structural variants and rare mutations. Methods: In this prospective double-blind, multicenter cohort study of 3,842 participants (3,362 pregnant women and 480 male partners), we conducted a head-to-head comparison to systematically evaluate the incremental clinical value and detection performance of single-molecule nanopore sequencing in thalassemia (SMITH) against conventional hematological testing and next-generation sequencing (NGS). Findings: The overall concordance rate between NGS and SMITH was 98.6% (3789/3842). The discrepant cases (n=53) were directly attributed to the superior detection capabilities of SMITH, which successfully identified complex structural rearrangements, including 45 α-globin gene triplications and four HK alleles, that were missed by NGS. Furthermore, SMITH accurately detected four rare variants (c.134_135insT/, c.-22(C>T)/, βN/βc.316-290delinsAGGGCAATAATTT and β3.5 kb deletion/βN) and resolved ten trans and three cis configurations within the globin gene allele. Clinically, these technical advantages translated to a 9.3% (5/54) increase in the detection rate of high-risk prenatal couples, effectively preventing one birth affected by moderate-to-severe thalassemia. Additionally, SMITH corrected a diagnostic discrepancy in one case (HK vs. -3.7), sparing the couple from an unnecessary invasive procedure. Interpretation: Our findings demonstrate that SMITH provides a powerful platform for resolving globin gene rearrangements, detecting rare variants, and enabling direct haplotype phasing. By effectively eliminating diagnostic blind spots, SMITH is expected to become an optimal method for thalassemia prevention programs.
Funding: This study was supported by Chinese National Natural Science Foundation Projects 81760037 and 82271894.

3
Global population frequencies of NAT2 star alleles observed in three large biobanks

Sangkuhl, K.; Whirl-Carrillo, M.; Woon, M.; Venkatesh, R.; Keat, K.; Whaley, R.; Ritchie, M. D.; Klein, T. E.

2026-06-11 genetic and genomic medicine 10.64898/2026.06.09.26355281 medRxiv
Top 4%
0.8%

NAT2 is an important pharmacogene which encodes the N-acetyltransferase 2 enzyme that is involved in the metabolism of multiple medications, and variants in this gene can affect patient response to these medications. CPIC has published a clinical guideline for prescribing hydralazine using NAT2 genotypes. Just prior to the guideline, updated NAT2 star allele numbering and definitions were released, differing somewhat from the historical nomenclature. Clinical pharmacogenomic testing panels often test for the most common star alleles, so knowledge of the most common updated NAT2 star alleles is critical for the implementation of the CPIC NAT2/hydralazine guideline. We first determined NAT2 diplotype frequencies from UK Biobank (UKBB) 200k phased genomes, then analyzed allele, diplotype, and phenotype population frequencies from the All of Us Research Program, PennMedicine BioBank (PMBB) and UKBB 500k datasets. We found that analyzing NAT2 diplotypes from phased data provides critical information for algorithms designed to predict diplotypes from unphased data. We observed that NAT2*5, *6, and *4 were the most common star alleles in that order, and the top 11 most frequent NAT2 star alleles were the same across all biobanks. However, differences in star allele frequencies across biogeographical populations were observed. The largest difference led to a higher frequency of NAT2 poor metabolizer phenotypes as compared to rapid and intermediate metabolizer phenotypes in all global populations except in the EAS population, where NAT2 poor metabolizers were in the minority.

4
Development of a Novel Blood-Based Assay for Brain-Derived Tau and Its Validation in Traumatic Brain Injury

Balogun, W. G.; Zeng, X.; Nafash, M. N.; Sehrawat, A.; Shi, R.; Svirsky, S. E.; Okonkwo, D. O.; Puccio, A. M.; Karikari, T. K.

2026-06-10 neurology 10.64898/2026.06.05.26354965 medRxiv
Top 4%
0.8%

Brain-derived tau (BD-tau) is an emerging blood-based biomarker for neurodegeneration, yet few well-validated BD-tau assays are currently available for research and clinical use. To enhance access to this vital biomarker for neurological disorders including traumatic brain injury (TBI), we developed a novel blood-based immunoassay for BD-tau on the ultra-sensitive Quanterix HD-X platform using Single Molecule Array technology. Analytical validation assessed dilution linearity, specificity, precision, detection limits, and spike recovery, each recording robust metrics in agreement with international expert recommendations. The assay achieved between-run stability of 95% when analyzing aliquots from six independent plasma and serum samples across five analytical runs. It also showed strong dilution linearity when diluted four-fold and achieved over 90% recovery when spiked with cerebrospinal fluid. Next, we evaluated the clinical utility of the assay in cohorts of individuals with TBI, where strong performances were recorded whether using the 2-step or 3-step assay formats (ρ = 0.94; p < 0.0001). Furthermore, plasma BD-tau distinguished samples from TBI patients based on time from injury and severity (AUC = 0.93). Plasma BD-tau differentiated between favorable and unfavorable functional outcomes in the acute-severe group. Our findings underscore the significant potential of the BD-tau assay as a biomarker for TBI in the severe phase.

5
Three-Month Observational Data for the MPS IIIB Sentinel Subject Following AAV9 Mediated Gene Therapy

Ma, X.; Gu, R.; Ma, W.; Xu, Q.; Wang, R.; Wang, W.; Liang, M.; Liu, X.; Yang, X.; Zhuang, L.; Zhang, W.; Zeng, X.; Xu, J.; Xu, X.; Wu, Z.; Xia, Y.; Liu, Y.; Zhou, J.; Zhu, X.; Wang, H.; Dong, Z.; Yang, W.; Dai, Y.; Pan, X.; Li, X.; Wang, Y.; Dong, X.; Wu, X.; Feng, Z.

2026-06-09 neurology 10.64898/2026.06.01.26354386 medRxiv
Top 5%
0.7%

Background: Mucopolysaccharidosis type IIIB (MPS IIIB) is a devastating neurodegenerative lysosomal storage disorder caused by alpha-N-acetylglucosaminidase (NAGLU) deficiency. There is currently no approved therapy. We report the 3-month outcomes of a novel intracerebroventricular (ICV) gene therapy in a child with MPS IIIB. Methods: In an open-label, single-center, investigator-initiated trial (ChiCTR2600121466), a single dose of RDGT-101 (2.0E14 vg of an AAV9 vector encoding human NAGLU) was administered via ICV infusion. Primary outcomes were safety and tolerability. Secondary outcomes included serum NAGLU activity, urinary heparan sulfate (HS) excretion, and neurocognitive function. Exploratory analyses included hematological parameters. Results: The patient achieved serum NAGLU activity (17.06 nmol/mL/hour) approaching that of healthy controls (17.75 ± 1.37 nmol/mL/hour) by Month 3, accompanied by a 58.4% reduction in urinary HS. Clinically, previously severe hand and toe contractures resolved, allowing for full extension. Neurocognitive improvements were observed, including clear articulation, logical conversation, and sustained eye contact. Hematological analyses revealed normalized red blood cell indices and improved iron utilization. No dose-limiting toxicities, serious adverse events, or clinically significant laboratory abnormalities were observed. Conclusions: A single ICV infusion of RDGT-101 was safe and well-tolerated in this patient with MPS IIIB. Early biochemical correction was accompanied by marked improvements in somatic, neurocognitive, and hematological parameters. These findings support further investigation of ICV AAV9 gene therapy for MPS IIIB.

6
Liver biopsy confirms precise and efficient correction of SERPINA1 after in vivo Base Editing in a Patient with Alpha-1 Antitrypsin Deficiency

Krooss, S. A.; Yang, T.; Yuan, Q.; Drick, N.; Sgodda, M.; Held, J.; Behrendt, P.; Hartleben, B.; Koczulla, R.; Ma, X.; Liu, Y.; Wedemeyer, H.; Janciauskiene, S.; Di Donato, N.; Cantz, T.; Wang, E.; Wu, Y.; Hoeper, M.; Xia, Q.; Ott, M.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.01.26354551 medRxiv
Top 5%
0.7%

Background: Alpha-1 antitrypsin deficiency (AATD) caused by the PI*ZZ mutation (Glu342Lys) results in hepatic accumulation of misfolded AAT-Z protein and reduced circulating AAT levels, leading to progressive liver disease and emphysema. Gene correction therapy represents a potentially curative approach by directly correcting the underlying genetic defect. We report the first case of successful hepatic gene correction with early histological and functional assessment. Methods/Case presentation: We report the case of a 66-year-old male patient with PI*ZZ AATD who underwent gene correction therapy within the YOLT-202 phase I/Ia clinical trial (ClinicalTrials.gov ID NCT07193615). Ten weeks post treatment, a liver biopsy was performed to re-evaluate pre-existing F2 liver fibrosis as measured by elastography before entering the study. Serum samples allowed functional assessment of the AAT-mediated elastase inhibition. Results: Liver biopsy did not show signs of hepatic inflammation and demonstrated 54% (Sanger) and 57% (Illumina) gene correction rate of the PI*ZZ variant on the DNA level with no bystander edits or off-target effects. Following a transient elevation of transaminases during the early post-treatment period, liver enzymes normalized. Monthly serum AAT measurements demonstrated biologically active and stable therapeutic levels throughout follow-up. Conclusions: This case demonstrates efficient and precise hepatic gene correction without concerning histological alterations and with substantial improvement of functional parameters, supporting the feasibility and safety of gene editing approaches for AATD.

7
Exploratory Assessment of Pulsed-Wave Doppler Representations of Lung Sounds Using Deep Learning: An In-Vitro Phantom Study

Saad, A. A.; Murthi, S. B.; Boctor, E. M.; Teeter, W. A.; Seam, N.

2026-06-10 respiratory medicine 10.64898/2026.06.09.26353787 medRxiv
Top 6%
0.7%

The increasing availability of portable ultrasound systems motivates exploration of novel approaches to respiratory signal assessment. In this in-vitro study, we investigate whether pulsed-wave (PW) Doppler ultrasound can capture structured spectral patterns from replayed lung sound recordings. Digitized respiratory sounds were replayed through a tissue-mimicking ultrasound phantom, generating 1,478 PW Doppler spectral images from recordings associated with healthy subjects and several externally labeled disease categories. Exploratory classification experiments using a ResNet-18 architecture demonstrated that these Doppler representations contain learnable differences under controlled conditions. These findings motivate further investigation into PW Doppler as a potential representation of respiratory acoustics.

8
Metatranscriptomics-Derived Disease Risk Scores as a Preventive, Diagnostic, and Treatment Support Tool

Hu, L.; Bass, M.; Patridge, E.; Molusky, M.; Antoine, G.; Vuyisich, M.; Banavar, G.

2026-06-06 genetic and genomic medicine 10.64898/2026.05.29.26354333 medRxiv
Top 8%
0.4%

Background: Chronic diseases and symptom syndromes often develop after prolonged biological changes that may precede formal diagnosis. RNA-based metatranscriptomics captures active microbial and human gene expression and may provide a functional layer for disease risk evaluation. To address this translational gap, we developed and validated a Disease Risk Score (DRS) framework that integrates metatranscriptome-derived pathway activity scores from stool, saliva, and blood samples, and evaluated its potential clinical utility as an adjunct risk-evaluation tool. Methods: DRS uses disease-specific sets of pathway activity scores derived from stool and saliva microbial functions, stool and saliva microbial taxa, and blood human gene expression. For each disease, 'not optimal' pathway scores are aggregated into a normalized cumulative odds ratio, or cOR, using score-level odds ratios, statistical significance, and literature-supported biological relevance derived from a Development Cohort of 22,369 individuals. A cOR ≥ 5 is defined as high risk. Performance is evaluated in an independent Validation Cohort of 15,908 individuals using self-reported diseases as the reference. Disease support requires both significant cOR separation between self-reported and not-reported (Cohen's d ≥ 0.2) and risk ratio enrichment of self-reported disease among individuals classified as high risk (95% CI of Risk Ratio > 1). Results: Of 20 initially evaluated diseases, 15 meet the prespecified validation criteria on the independent validation cohort: ADHD, anxiety, chronic fatigue syndrome, depression, GERD, hypertension, inflammatory bowel disease, IBS-C, IBS-D, insomnia, MASLD, obesity, obstructive sleep apnea, Sjogren's syndrome, and type 2 diabetes.
Five selected clinical scenarios illustrate how DRS can support clinician-mediated decision making, including IBS subtype reclassification, improved diagnostic acceptance in IBS-D, personalized lifestyle counseling in MASLD and early type 2 diabetes, and diagnostic uncertainty in atypical GERD. Conclusions: DRS is a metatranscriptomics-based risk-stratification framework that aggregates active microbial and human pathway signals into interpretable disease-specific risk estimates across a wide range of disease conditions. Validation against self-reported disease labels in an independent cohort shows significant risk enrichment for each of 15 diseases. DRS is intended as an adjunct to clinical evaluation: a decision support tool in situations where routine care encounters uncertainty, delay, or low patient engagement. Future prospective studies using clinically adjudicated endpoints are needed to assess calibration and clinical outcomes.

9
Genetic Susceptibility to Incisional Hernia: Evaluation of Hernia Polygenic Risk Scores

Pregnall, A. M.; Hornick, M. M.; Broach, R. B.; Judy, R.; DePaolo, J.; Yuan, S.; Levin, M.; Fischer, J. P.; Damrauer, S. M.; Wachtel, H.

2026-06-11 genetic and genomic medicine 10.64898/2026.06.10.26355374 medRxiv
Top 9%
0.4%

Objectives: Incisional hernia (IH) affects 13-30% of people after abdominal surgery, resulting in substantial morbidity and costs. While clinical risk factors have been studied extensively, genomic risk for IH is incompletely understood. We aimed to evaluate the impact of polygenic risk scores (PRS) on IH risk prediction. Methods: We created and evaluated three PRS for abdominal hernia, ventral hernia and latent hernia susceptibility for prediction of IH in an institutional biobank. The primary outcome was defined as the diagnosis or repair of an IH based on ICD-9/10-CM/PCS and CPT codes. Clinical covariates included age, sex, body mass index (BMI), smoking status, index procedure type, and perioperative surgical site infection. A phenome-wide association study (PheWAS) was performed to assess clinical associations with increased PRS. We then tested the ability of the PRS to improve prediction for IH by modeling clinical covariates with and without PRS in patients who underwent abdominal surgery. Model performance was assessed using 10 iterations of 5-fold cross-validation to estimate Brier scores and area under the receiver operating characteristic curve (AUROC), which were compared using cross-model Bayesian analysis of variance. Results: In 55,809 subjects, assessed PRS was significantly associated with incisional, umbilical, and ventral hernia on PheWAS, with 1.19 greater odds of developing IH per 1-SD increase in PRS (95% CI: 1.13-1.25, P < 0.001). Of 9,909 subjects who underwent qualifying abdominal surgery, 706 developed IH. In this cohort, the latent hernia susceptibility PRS was associated with a 16% increased hazard of developing IH per 1-SD increase (HR 1.16; 95% CI: 1.07-1.26; P < 0.001).
Compared to a predictive model using clinical covariates (Brier score = 0.047, 95% CI: 0.046-0.048; AUROC = 0.660, 95% CI: 0.653-0.666), addition of the PRS showed similar Brier score and AUROC estimates (Brier score = 0.047, 95% CI: 0.046-0.048; AUROC = 0.667, 95% CI: 0.661-0.673) at five years. Cross-model Bayesian analysis demonstrated >99% probability of practical equivalence when trying to detect a difference of ≥ 0.02. Conclusion: All three PRS for hernia were independently associated with IH, suggesting that genomic factors contribute significantly to IH development. However, none of the three PRS meaningfully improved clinical IH risk prediction in patients who underwent abdominal surgery. This suggests that clinical comorbidities and surgical techniques may be as important as genomic architecture.
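
For readers unfamiliar with the Brier score reported in this abstract, the following is a minimal Python sketch of its standard definition: the mean squared difference between predicted probabilities and observed binary outcomes, where lower is better. It is not the authors' code; the function name `brier_score` and the toy inputs are illustrative assumptions.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Hypothetical helper for illustration only; not taken from the preprint.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    # Average the squared gap between each prediction and what actually happened.
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


if __name__ == "__main__":
    # Toy data (made up): two patients, one with the event and one without.
    print(brier_score([0.0, 1.0], [0, 1]))  # perfect predictions
    print(brier_score([0.5, 0.5], [0, 1]))  # uninformative predictions
```

Because the score is an average of squared errors, two models with the same Brier score and nearly the same AUROC, as reported here, are practically indistinguishable on both calibration-sensitive and ranking-based measures.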

10
Prevalence and Clinical Significance of Adult-Onset Cancer Predisposition Variants in Pediatric Oncology

Maciaszek, J. L.; Pastor Loyola, V.; Cain, T.; Cardenas, M.; Blackburn, P. R.; Wilkinson, M. R.; Koo, S. C.; Wu, C.-H.; Li, C.; Wang, L.; Nichols, K. E.; Klco, J. M.; Eldomery, M. K.

2026-06-08 genetic and genomic medicine 10.64898/2026.06.07.26354365 medRxiv
Top 9%
0.4%

Purpose: Pathogenic or likely pathogenic (P/LP) variants are increasingly identified in genes more commonly associated with adult-onset cancer predisposition, but their prevalence and relevance to children who present with cancer remain unclear. Methods: We retrospectively analyzed 1,280 consecutive pediatric patients with cancer who underwent clinical germline sequencing, using a virtual panel, from 2021 to 2024. Genes with P/LP variants were categorized as adult-onset cancer predisposition genes (aoCPG) or pediatric-onset cancer predisposition genes (poCPG) according to cancer risk before age 18 years and pediatric surveillance recommendations. Variant relevance was adjudicated using tumor diagnosis/histopathology, immunohistochemistry, and tumor molecular features and classified as primary, secondary, or indeterminate. Results: Among 1,280 patients, 197 (15.4%) harbored 211 P/LP variants across 54 genes. Sixty-six variants (31.3%) occurred in aoCPG, 87 (41.2%) in poCPG, and 58 (27.5%) were heterozygous variants in autosomal recessive genes. Among adult-onset variants, 7 (10.6%) were primary, 54 (81.8%) secondary, and 5 (7.6%) indeterminate. Among pediatric-onset variants, 77 (88.5%) were primary and 10 (11.5%) secondary. Six patients (3 adult-onset variants; 3 pediatric-onset variants) received targeted therapy informed by germline/somatic sequencing results. Conclusion: In pediatric oncology, most variants in aoCPG are secondary rather than tumor-related findings. Tumor-informed interpretation, beyond variant classification, may improve reporting, counseling, and therapeutic decision-making.

11
Foundation model-based tool for automated ulcerative colitis histology scoring demonstrates non-inferiority to pathologists across multiple scoring indices

Tahir, W.; Shamshoian, J.; Tauber, J.; Clinton, L. K.; Griffin, M.; Shah, C.; Singh, G.; Fahy, D.; Sucipto, K.; Brosnan-Cashman, J.; Altepeter, T. A.; Bhattacharya, S.; Crandall, W.; Duan, C.; Gale, J. D.; Gupta, V.; Haarmann, H.; Harpaz, N.; Hooper, A. T.; Horowitz, J.; Hurtado-Lorenzo, A.; Hussaini, B. E.; Jairath, V.; Jones, A.; Kostiuk, B.; Kukreja, A.; Laroux, F. S.; Lissoos, T.; McBride, R. B.; Najdawi, F.; Nayyar, A.; Osterman, M. T.; Panchal, P.; Ruane, D.; Travis, S.; Visvanathan, S.; Wilson, L.; Jayson, C.

2026-06-11 pathology 10.64898/2026.06.09.26355212 medRxiv
Top 10%
0.3%

In clinical trials for ulcerative colitis (UC), pathologists assess disease severity through standardized histological indices, including the Geboes Score, Robarts Histopathology Index (RHI), and Nancy Histologic Index (NHI). Despite strong associations with clinical outcomes, histologic scoring suffers from inter- and intra-reader variability, and consensus criteria for histologic remission remain uncertain. Through a consortium approach, we developed an artificial intelligence-based measurement (AIM) tool for scoring histology in UC mucosal biopsies (AIM-HI UC). This model, trained on a large dataset of UC biopsies (N=10,230), utilizes additive multiple instance learning models leveraging PLUTO, a pathology foundation model, that predict each of the Geboes subgrades, from which the Geboes grade-level score, RHI, and NHI can be calculated. Evaluation of this model on a standalone verification set including clinical trial specimens established algorithm non-inferiority and/or superiority relative to standard qualified pathologists through comparison of algorithm-consensus and pathologist-consensus agreement metrics (non-inferior if difference >-0.1, superior if difference >0, inclusive of confidence intervals). AIM-HI UC was determined to be non-inferior to pathologists (N=3) for the prediction of all seven Geboes subgrades, grade-level Geboes, RHI, NHI, histologic improvement (GS<3.1), 2A histologic remission (GS<2A.0), and 2B histologic remission (GS<2B.0). AIM-HI UC was superior to pathologists for several Geboes subgrades (GS 0, GS 1, GS 2B, and GS 5), as well as grade-level Geboes, RHI, and positive percent agreement of 2A histologic remission. The model was shown to be greater than 99% repeatable for all histologic scoring metrics examined. 
Model-derived scores were shown to strongly correlate with canonical histologic features of inflammation, including the proportion of total epithelium that is inflamed (Spearman r=0.83; p<0.01), the proportion of neutrophils localized within crypt epithelium (Spearman r=0.83, p<0.01), and the amount of mucosal area classified as erosion or ulceration (Spearman r=0.80, p<0.01). Overall, these results suggest that AIM-HI UC has the potential to improve consistency of UC histology interpretation, providing a path toward standardization of UC histology scoring in clinical trials.

12
Rare neurological and neurodevelopmental variants in ALS link to onset, survival and family history

O'Donoghue, C.; Kacar, E.; Gomes, T.; Costello, E.; Pender, N.; Peelo, C.; Ryan, M.; Heverin, M.; Byrne, S.; Bede, P.; Hardiman, O.; McLaughlin, R. L.; Byrne, R. P.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26354977 medRxiv
Top 11%
0.3%

Background: Neurological, neuropsychiatric, and neurodevelopmental disorders cluster in ALS families, sharing a common genetic architecture with ALS. Pathogenic variants in genes associated with other neurological, neurodevelopmental, or neuropsychiatric disorders may also co-occur in ALS and modify phenotype. We have sought to determine the prevalence and clinical pattern of likely-pathogenic/pathogenic (LP/P) non-ALS neurological, neurodevelopmental, and neuropsychiatric variants, alone and in combination with ALS-gene variants, in two large ALS cohorts. Methods: Whole-genome sequencing (WGS) of 469 Irish and 774 Answer ALS people with ALS (pwALS) was analysed for ClinVar LP/P variants associated with other neurological (n = 15541), neurodevelopmental (n = 9761), and neuropsychiatric (n = 321) phenotypes. Inheritance patterns for associated genes (autosomal recessive/autosomal dominant) along with the associated phenotype were validated using OMIM. Standardised clinical data included family history, site and age of onset, El Escorial category, survival, motor decline, and cognitive and behavioural assessments. Known ALS-gene variants and C9orf72 repeat expansion status were included for each cohort. Results: Non-ALS neurological variants were identified in 47/469 (10.0%) Irish and 69/774 (8.9%) Answer ALS participants, most frequently in hereditary spastic paraplegia-associated genes (3.2% Irish; 2.8% Answer ALS). Irish neurological variant carriers showed higher frequency of respiratory onset (10.6% vs 1.2%, Fisher's exact p = 0.002, φ = 0.20) and fewer premorbid behavioural symptoms (0.92 ± 0.56 vs 3.08 ± 0.97, Cohen's d = -0.40). Neurodevelopmental variants occurred in 12/469 (2.6%) Irish and 20/774 (2.6%) Answer ALS participants.
In the Irish cohort, neurodevelopmental variant carriers had significantly shorter survival in Cox proportional hazards model (log-rank p = 0.005), corresponding to a more than two-fold increased hazard of death (HR = 2.25, 95% CI 1.26-4.00), and had significantly increased familial burden of neuropsychiatric disorders among first- and second-degree relatives (negative binomial IRR for carriers = 2.41, 95% CI: 1.12-5.18, p = 0.025). Across combined cohorts, 18 individuals (Irish n = 8; Answer ALS n = 10) carried ≥2 LP/P variants spanning ALS and non-ALS genes. Conclusion: Rare LP/P variants in genes associated with other neurological and neurodevelopmental disorders occur in up to 12% of pwALS across two independent cohorts. Carriers show distinct phenotypes, shorter survival, and characteristic family history patterns. These findings suggest that extended pleiotropic and oligogenic architectures may contribute to ALS heterogeneity.

13
More Than Results: A Qualitative Study on the Role of Person-Centered Genetic Counseling in Parkinson Disease Research

Verbrugge, J.; Fiallos, K.; Cook, L.; Miller, M.; Head, K. J.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.03.26354465 medRxiv
Top 15%
0.2%

As genetic testing becomes increasingly integrated into Parkinson disease (PD) research, including targeted testing for variants in LRRK2 and GBA1, the return of individual research results is becoming more common. However, limited qualitative data exist regarding how research participants experience genetic results disclosure and post-test genetic counseling in PD research settings. We conducted semi-structured qualitative interviews with participants (n=13) enrolled in the Parkinson Precision Medicine Initiative (formerly Parkinson Progression Markers Initiative; PPMI) who had received PD-related genetic test results and post-test genetic counseling. Interviews were conducted 1 to 3 weeks following result disclosure and analyzed using thematic analysis with a primarily deductive coding approach informed by study aims and inductive identification of emergent themes. Four primary themes were identified: (1) personal connection and motivations for participation, (2) centrality of result disclosure and information preferences, (3) emotional experiences and support needs, and (4) communication quality and alignment with participant needs. Overall, our findings underscore the importance of person-centered genetic counseling within PD research. As return of genetic and biomarker results in research and clinical trial contexts expands, thoughtful integration of relational, informational, and communication-focused practices will be essential to support participant engagement and trust.

14
Multimodal approach to identify neuropsychophysiological subgroups in myalgic encephalomyelitis/chronic fatigue syndrome and their relevance for rehabilitation: protocol for a mechanistic cross-sectional and longitudinal study

Dooms, Y.; Qiu, L.; Coppieters, I.; Vergaelen, E.; Claes, S.; Dupont, P.; Hehl, M.; Cuypers, K.; Engler, H.; Dombrowski, K.; Verbeke, K.; Van den Bergh, O.; Raes, J.; Van Oudenhove, L.; Van Den Houte, M.; Bogaerts, K.

2026-06-08 neurology 10.64898/2026.06.05.26354983 medRxiv
Top 15%
0.2%

Introduction: Myalgic Encephalomyelitis (ME)/Chronic Fatigue Syndrome (CFS) is a debilitating condition characterised by severe fatigue and post-exertional malaise (PEM). Reported neuropsychophysiological abnormalities suggest ME/CFS is multifactorial, but current knowledge remains fragmented. This study protocol outlines a multimodal investigation designed to (1) compare neuropsychophysiological mechanisms between ME/CFS patients and healthy participants, (2) test an integrative model of ME/CFS, (3) identify neuropsychophysiological subgroups within the patient population, and (4) identify predictors of symptom response during rehabilitation. Methods and analysis: This study will enroll 115 ME/CFS patients and 55 healthy participants. Groups will be comparable in age, sex, and education level, with a larger patient sample enabling subgroup and longitudinal analyses. A cross-sectional assessment at baseline will be carried out in both groups. Patients will then be evaluated longitudinally throughout a standardized cognitive-behavioral therapy rehabilitation program delivered as routine care. Baseline measures include systemic inflammation and general health biomarkers, measures of autonomic and central nervous system function, neuroinflammation (magnetic resonance spectroscopy, [18F]DPA714 PET in a subsample), serum short-chain fatty acid levels, gut microbiota composition and function, and neuroendocrine and self-reported responses to psychosocial stress. Fatigue severity (physical and cognitive) and PEM will be assessed through validated questionnaires, ecological momentary assessment, and laboratory tasks. These will be re-evaluated during therapy, and all non-neuroimaging measures will be repeated after the rehabilitation program. 
Statistical analyses will comprise multivariate analysis of variance, general linear models, classification algorithms, structural equation models, least absolute shrinkage and selection operator principal component regression (LASSO-PCR), cluster analysis, and latent class growth analysis (LCGA).

15
Artificial intelligence-assisted ganglion cell detection in Hirschsprung's disease: A comparative evaluation of two deep learning approaches

Wang, E.; Grenier, K.; Savadjiev, P.; Poenaru, D. D.

2026-06-12 pathology 10.64898/2026.06.11.26354826 medRxiv
Top 16%
0.2%

Background. Definitive diagnosis of Hirschsprung's disease (HD) requires pathological identification of enteric ganglion cells. This process is time-consuming and subject to inter-observer variability. Artificial intelligence (AI) tools have the potential to standardize and accelerate this workflow, but no study has determined which AI approach best serves intraoperative HD pathology diagnostics. Method. This study compared the U-Net and You Only Look Once version 26 (YOLO26) frameworks for ganglion cell detection using a single-centre retrospective dataset of 54 whole-slide images (WSIs) from rectal biopsies. WSIs were tiled into 397,731 image patches (128x128 pixels), further partitioned into training (70%), validation (15%), and testing (15%) sets. Models were evaluated on tile- and patient-level diagnostic metrics and processing latency. Results. The U-Net achieved a tile-level sensitivity of 82.9%, showing no statistically significant difference compared to YOLO26 (79.1%; p = 0.097). However, YOLO26 demonstrated a statistically significant advantage in tile-level specificity (96.1% vs. 93.9%; p < 0.001) and reduced mean inference latency (7.64 ms vs. 11.57 ms/tile). At the patient level, both models achieved 100% diagnostic sensitivity. Despite low patient-level specificity (0.0% U-Net; 11.8% YOLO26), the tissue-level diagnostic burden of false positives was 6.00% for U-Net and 3.50% for YOLO26. Conclusion. The U-Net is preferred when nominal gains in sensitivity are prioritized, while the YOLO26 is an alternative that optimizes efficiency and false positive suppression. Both models serve as robust screening filters to augment the pathologist's workflow and should be selected based on workflow requirements. Prospective validation on larger, multi-centre datasets is required before clinical implementation.
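The gap between tile-level and patient-level specificity reported above can be illustrated with a small sketch. It assumes a patient is flagged as soon as any one of their tiles is predicted positive; the abstract does not state the aggregation rule, so this rule, the function names, and the toy data are all hypothetical.

```python
from collections import defaultdict


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for paired binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def aggregate_to_patient(patient_ids, tile_flags, min_positive_tiles=1):
    """Assumed rule: a patient is positive if at least `min_positive_tiles` tiles are positive."""
    counts = defaultdict(int)
    for pid, flag in zip(patient_ids, tile_flags):
        counts[pid] += int(flag)  # also registers patients with zero positive tiles
    return {pid: n >= min_positive_tiles for pid, n in counts.items()}


# Toy data (hypothetical): patient A has ganglion cells, patient B does not.
patient_ids = ["A", "A", "A", "B", "B", "B", "B"]
tile_truth = [1, 0, 0, 0, 0, 0, 0]
tile_pred = [1, 0, 0, 0, 1, 0, 0]  # one false-positive tile in patient B

tile_sens, tile_spec = sensitivity_specificity(tile_truth, tile_pred)

truth_by_patient = aggregate_to_patient(patient_ids, tile_truth)
pred_by_patient = aggregate_to_patient(patient_ids, tile_pred)
patients = sorted(truth_by_patient)
patient_sens, patient_spec = sensitivity_specificity(
    [truth_by_patient[p] for p in patients],
    [pred_by_patient[p] for p in patients],
)
```

Under this assumed rule, a single false-positive tile is enough to misclassify a whole patient, which is one way high tile-level specificity can coexist with very low patient-level specificity.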

16
STELLAR: A flexible ensemble learning framework integrating rare variants to enhance polygenic risk prediction

Chen, T.; Li, X.; Mazumder, R.; Zhang, H.; Lin, X.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.07.26355109 medRxiv
Top 17%
0.2%

Whole-exome and whole-genome sequencing technology has enabled the discovery of rare genetic variants associated with human health and diseases. However, existing statistical methods used for rare variant association testing are not well-suited for building genetic risk prediction models that jointly incorporate rare and common variants. We propose STELLAR, a flexible ensemble learning-based approach to compute rare variant polygenic risk scores (PRS) using association summary statistics to enhance conventional common variant PRS. Our method combines burden-based and penalty-based rare variant analysis and leverages functional annotation information to prioritize potentially causal variants within the prediction models. In simulation studies, PRS using STELLAR consistently showed the highest prediction accuracy compared to models using common variants alone or rare variant burdens. Applied to UK Biobank whole-exome sequencing data (n=310,831) across eight continuous and five binary traits, STELLAR significantly improved prediction accuracy, refined stratification of individuals at the highest genetic risk beyond common variants, and prioritized biologically relevant genes. STELLAR provides a scalable strategy to incorporate rare variants into PRS in addition to common variants, advancing precision risk prediction and enabling more comprehensive assessment of genetic contributions to complex diseases.

17
Dissecting the functional landscape of rare diseases through genomic variation in a heterogeneous cohort of 11,000 patients

Uria-Regojo, G.; Fernandez-Caballero, L.; Lopez-Alcojor, A.; Lopez-Lopez, L.; Benitez, Y.; Rodilla, C.; Avila Fernandez, A.; Trujillo-Tiebas, M. J.; Osorio, A.; Corton, M.; Almoguera, B.; Ayuso, C.; Minguez, P.

2026-06-11 genetic and genomic medicine 10.64898/2026.06.10.26355349 medRxiv
Top 17%
0.2%

Rare diseases (RDs) remain a major diagnostic challenge. Genetic and phenotypic heterogeneity, incomplete knowledge of disease mechanisms, and limitations in clinical variant interpretation leave many patients without a molecular diagnosis. Meanwhile, the growing volume of genomic data generated in clinical practice offers an opportunity to develop data-driven methodologies for exploring disease mechanisms and improving the reanalysis of unsolved cases. We aggregated real-world genomic data from 11,084 unrelated patients with suspected RD. Patients were clinically classified into 122 diseases. We built a multi-disease genomic variant frequency database (FJD-DB), which enabled the development of variant and gene-disease association scores by means of case-control subcohort comparisons across 32 disease groups. Functional enrichment analyses were then used to highlight disease-associated protein domains, pathways, biological processes, and phenotypes. Finally, the resulting knowledge was integrated into a data-driven framework for the guided reanalysis of unsolved RD patients, applied to patients with Inherited Retinal Dystrophies (IRD) as a first use case. FJD-DB contained more than 45 million unique variants, including ~185,000 potentially pathogenic variants. Disease-specific analyses identified disease-associated pathogenic variants and highlighted both established and candidate disease genes. We detected 179 significantly enriched protein domains across 23 diseases, 124 Human Phenotype Ontology terms across 13 diseases, 79 Reactome pathways across 10 diseases, and 72 Gene Ontology biological processes across 8 diseases, revealing highly disease-specific functional signatures. Integration of disease-specific variant, gene, and functional association signals enabled the development of a data-driven framework for guided reanalysis of unsolved RD cases.
Applied to more than 1,100 unsolved IRD cases, the framework generated clinically relevant findings in 26 patients, including four molecular diagnoses, seven candidate diagnoses, and 15 cases upgraded from non-informative findings to variants of uncertain significance. Aggregated real-world genomic data can be leveraged to identify disease-associated molecular signals, generating novel biological hypotheses. A unified analytical framework provides a scalable strategy for knowledge discovery and guided reanalysis, facilitating the identification of overlooked and potentially novel genetic causes of RDs.

18
Population-scale detection of methylation outliers from long-read genome sequencing

Jensen, T. D.; Kaur, R.; Bonner, D. E.; Nguyen, J.; Reuter, C. M.; Undiagnosed Diseases Network; Genomics Research to Elucidate the Genetics of Rare Diseases (GREGoR) Consortium; Ashley, E. A.; Bernstein, J. A.; Wheeler, M. T.; Montgomery, S. B.

2026-06-11 genetic and genomic medicine 10.64898/2026.06.09.26355279 medRxiv
Top 18%
0.2%

Background: Aberrant DNA methylation can mediate the functional effects of rare genetic variation and contribute to imprinting disorders, repeat expansion diseases, and other pathogenic regulatory mechanisms. Long-read sequencing technologies now enable genome-wide detection of CpG methylation alongside genetic variation from a single assay. However, methods for systematic identification and interpretation of methylation outliers from long-read sequencing data remain limited. Methods: We developed METAFORA, a computational workflow for detecting methylation outlier regions from PacBio and Oxford Nanopore long-read sequencing data. METAFORA constructs population-level methylation references, segments the genome into correlated CpG blocks, infers technical and biological sources of variation through hidden factor estimation, models uncertainty due to variable depth sequencing, and computes covariate-adjusted methylation outlier scores for individual samples. We applied METAFORA across large long-read sequencing cohorts and integrated methylation outliers with multi-omic data. METAFORA is implemented as a snakemake workflow available at https://github.com/tjense25/METAFORA. Results: METAFORA identified methylation outlier regions associated with rare structural variants, tandem repeat expansions, and imprinting abnormalities. We found outlier regions were enriched for molecular outliers across transcriptomic and chromatin accessibility datasets, supporting their functional relevance in gene regulation. In a representative case, METAFORA identified an imprinting defect affecting the GNAS locus associated with an STX16 deletion. Conclusions: METAFORA enables scalable detection and interpretation of methylation outliers from long-read sequencing data and provides a framework for integrating epigenetic outliers with genomic and multi-omic analyses. 
These approaches may improve interpretation of rare regulatory variation and support discovery of clinically relevant epigenetic abnormalities in genomic medicine.

19
Multi-ancestry analysis of POLG variants in Parkinson's disease

Tay, Y. W.; Elsayed, I.; Yeow, D.; James, M.; Kung, P.-J.; Screven, L.; Dilliott, A. A.; Alcalay, R. N.; Fang, Z.-H.; Tan, A. H.; Global Parkinson's Genetics Program (GP2); Sue, C. M.; Lange, L. M.; Perinan, M. T.

2026-06-08 genetic and genomic medicine 10.64898/2026.06.07.26354811 medRxiv
Top 19%
0.1%

Introduction: Variants in the polymerase gamma (POLG) gene are associated with a wide range of mitochondrial disorders. Emerging evidence suggests a potential link between POLG variants and Parkinson's disease (PD); yet, results remain inconclusive. Objectives: To investigate the genetic spectrum and prevalence of POLG variants in PD across diverse ancestries. Methods: We leveraged multi-ancestry genetic data from the Global Parkinson's Genetics Program (GP2), including genotyping data from 98,589 and short-read sequencing data from 36,022 individuals. We performed a POLG rare variant screen, case-control association, and gene-level burden analyses. Results: Five PD cases carried potentially biallelic rare pathogenic/likely pathogenic POLG variants. Additionally, 228 individuals (<1%; 161 PD cases, 28 individuals with other neurological disorders, and 39 controls) carried 34 distinct rare pathogenic/likely pathogenic heterozygous variants, with no significant frequency differences between cases and controls, except for the p.Ala467Thr variant in the European population. The co-inherited pathogenic variants p.Thr251Ile and p.Pro587Leu were present in <1% of both cases and controls, with no significant group differences. Burden and variant-level association analyses showed no association between rare POLG variant burden or common POLG variant enrichment and PD. Conclusions: POLG variants are overall rare in PD. The identification of rare pathogenic variants among PD cases suggests that POLG-related mitochondrial dysfunction may contribute to PD in isolated instances, particularly under recessive inheritance. Our findings support a role for POLG variants in select cases and underscore the need for larger-scale sequencing and functional studies.

20
Conversational Speech for Respiratory Triage in Primary Care: A Pilot Study

Ravi, V.; Noufi, C.

2026-06-11 respiratory medicine 10.64898/2026.06.09.26355284 medRxiv
Top 21%
0.1%

Background. Respiratory complaints account for a substantial share of adult ambulatory care visits, and triaging them accurately has direct consequences for antibiotic stewardship and pathogen-specific therapy. Prior work has investigated voice as a triage signal, but that literature is dominated by single-condition detection from scripted speech in crowdsourced or controlled clinical settings and has not been evaluated at primary care scale on conversational ambient audio. Methods. A dataset of 514,377 ambient-recorded primary care visits from 379,225 adult patients at a US clinic network was used, with per-visit clinically assigned ICD-10 diagnosis codes and de-identified demographic and geographic metadata. Patient audio was extracted from each doctor-patient conversation, and spectral, voice quality, and prosodic features were computed. Eleven binary classification tasks were defined, aligned with a respiratory triage cascade (e.g., acute respiratory versus acute non-respiratory illness, and lower versus upper respiratory tract infection). An acoustic model (feed-forward network) was trained independently for each task using patient-stratified five-fold cross-validation and evaluated on a held-out test set. Each task's model was also compared against six non-acoustic baselines using a single demographic, geographic, or temporal variable. The 11 trained classifiers were composed into a hierarchical cascade and illustrated as case studies on selected patients. Results. Test-set AUC across the 11 tasks ranged from 0.602 (95% CI: 0.588-0.614) to 0.745 (95% CI: 0.742-0.748), with a mean expected calibration error of 0.018. Six of eleven binaries outperformed all confounder baselines. Four binaries showed median within-stratum AUC of 0.62-0.70 when the confounder was held fixed, indicating acoustic discrimination beyond what the confounder alone explains. 
The exception was the pneumonia versus non-pneumonia lower respiratory tract infection binary, which failed against the patient-city confounder baseline, plausibly reflecting a clinic-level difference in ICD-10 coding. Conclusion. Conversational primary care audio carries acoustic signal that discriminates clinically meaningful respiratory contrasts. Absolute performance is moderate, but the conditions are stricter than prior work: conversational speech and differential-diagnosis contrasts among sick patients. This pilot study is a baseline for voice-based clinical AI moving beyond sick-versus-healthy detection toward differential-diagnosis panels and a proof-of-concept for hierarchical reasoning.
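For readers unfamiliar with the expected calibration error (ECE) quoted in the Results, the following is a minimal sketch of one common binary formulation: predictions are grouped into equal-width probability bins, and the gap between mean predicted probability and observed positive rate is averaged with bin-size weights. The abstract does not specify the binning scheme or number of bins, so `n_bins=10` and the function name are assumptions.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE with equal-width bins over predicted positive-class probability.

    probs  : predicted probabilities in [0, 1]
    labels : true binary labels (0 or 1)
    n_bins : number of equal-width bins (assumed value, not from the abstract)
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one prediction is required")

    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        positive_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - positive_rate)
    return ece


# Toy example (hypothetical data): each bin is off by 0.1.
ece_toy = expected_calibration_error([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1])

# Perfectly calibrated toy example: predicted 0.5, half the cases positive.
ece_calibrated = expected_calibration_error([0.5, 0.5], [0, 1])
```

A value near zero means predicted probabilities track observed event rates closely; it says nothing by itself about discrimination, which is why the abstract reports AUC alongside it.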